In a recent paper [e-print quant-ph/0101012], Hardy has given a derivation of
"quantum theory from five reasonable axioms." Here we show that Hardy's first
axiom, which identifies probability with limiting frequency in an ensemble, is not
necessary for his derivation. By reformulating Hardy's assumptions, and modifying
a part of his proof, in terms of Bayesian probabilities, we show that his work can be
easily reconciled with a Bayesian interpretation of quantum probability.



I. INTRODUCTION 
In Bayesian probability theory [1, 2], probabilities are not objective states of nature,
but rather are taken to be degrees of belief that determine an agent's decisions in the face
of uncertainty. It can be shown that degrees of belief must obey the usual rules of the
probability calculus if the agent's decisions are rational (for references and a summary of
the argument, see [?]). In a Bayesian framework, probabilities and measured frequencies are
strictly separate concepts. This leads to conceptual clarity in statements that involve both
probabilities and frequencies. Furthermore, adopting the Bayesian viewpoint has important
practical consequences in the field of statistics [?, ?].

If the Bayesian interpretation is applied to quantum mechanical probabilities, one is led
naturally to the viewpoint that quantum states represent states of belief. This viewpoint
is attractive for many reasons. For instance, it eliminates the difficulties associated with
regarding quantum state collapse as a real physical process. Within the Bayesian framework,
one can account effortlessly for the tight connection between measured frequencies and the
probabilities obtained from the quantum probability rule [?]. The Bayesian approach has led
to new mathematical results [?, ?], a better understanding of prior information in quantum
tomography [7], and an optimized entanglement purification protocol [?].



Hardy [9] (see also [10]) has recently given a derivation of the mathematical structure
of quantum theory from five simple axioms. In his first axiom, Hardy identifies probability 
with measured frequency in the limit of an infinite number of repetitions of a given experi- 
ment. In Hardy's formulation, a quantum state is a property of a preparation device. This 
is a problematical notion. Attempts to base probability theory on a definition of probability
as frequency in infinite ensembles [11] have largely failed (see, e.g., [12, 13]). For instance,
without further complicating assumptions, a relative frequency specified for an infinite en- 
semble does not in any way restrict the corresponding frequency for a finite subensemble. 
Furthermore, attaching the notion of a quantum state to a preparation device appears to 
limit quantum theory to the description of laboratory experiments. But surely one would 
want to assign a quantum state, e.g., to a light pulse arriving from a distant star.
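
The remark that a relative frequency specified for an infinite ensemble does not restrict the frequency in a finite subensemble can be illustrated with a minimal Python sketch. The outcome record below is a hypothetical construction, not taken from Hardy's paper: it reads "no" for the first `prefix` trials and alternates afterwards, so the limiting frequency of "yes" is 1/2 although the frequency in the initial finite block is 0.

```python
def outcome(i, prefix=1000):
    """Hypothetical yes/no record: 0 ("no") for the first `prefix` trials,
    then alternating 0, 1, 0, 1, ... (an assumption made for illustration)."""
    if i < prefix:
        return 0
    return (i - prefix) % 2


def relative_frequency(n, prefix=1000):
    """Fraction of "yes" outcomes (value 1) among the first n trials."""
    return sum(outcome(i, prefix) for i in range(n)) / n


if __name__ == "__main__":
    # The finite block of 1000 trials shows frequency 0; longer runs drift toward 1/2.
    for n in (1_000, 10_000, 1_000_000):
        print(n, relative_frequency(n))
```

Making `prefix` larger leaves the limiting value unchanged, which is why a limiting frequency alone says nothing about any fixed finite run.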

The details of Hardy's mathematical proof turn out to be mostly independent of the 
specific assumptions of his first axiom. In the present paper, we show that it is indeed 
possible to reformulate Hardy's derivation in such a way that the axioms refer to Bayesian 
probabilities for the outcomes of measurements performed on a single physical system (see 
also Hardy's remarks at the end of section 6.1 of [9]). In Sec. II, we briefly review Hardy's
basic setup and axioms. In Sec. III, we provide a Bayesian reformulation of the problem
and explain how Hardy's proof can be modified accordingly. In Sec. IV, we conclude by
showing how the connection between probabilities and measured frequencies is recovered in 
our formulation. 



II. HARDY'S SETUP 

In [9], Hardy considers the following situation. An experimenter has a preparation device,
a transformation device, and a measurement device. Associated with each preparation is a
state, "defined to be (that thing represented by) any mathematical object that can be used
to determine the probability associated with the outcomes of any measurement that may be
performed on a system prepared by the given preparation." If a physical system is incident
on the measurement device, it outputs a number $l$, where $l = 1, \ldots, L$. If no physical system
is incident on the measurement device, it outputs the number 0.

Hardy then defines a probability measurement in the following way. A given measurement 
is performed on an ensemble of n systems each prepared by a given preparation device. Then 
the number of times, $n_+$, is recorded that a particular outcome $l_i$, or subset of outcomes
$S_i \subset \{1, \ldots, L\}$, is observed. The measured probability is then defined as

$$ \mathrm{prob}_+ = \lim_{n \to \infty} \frac{n_+}{n} . \tag{1} $$

It is then assumed that there exists a minimum number, K, of appropriately chosen prob- 
ability measurements that completely specify the state of the system. These K probabilities 
can be represented by a column vector 

$$ \mathbf{p} = (p_1, p_2, \ldots, p_K)^T , \tag{2} $$

which represents the state. The result of any probability measurement can be inferred from 
the vector p. The number K is called the number of degrees of freedom of the system. It 
follows from the axioms below that the set of states is convex; pure states are defined as the 
extremal points of this convex set. 

Finally, the dimension, N, of the system is defined as the maximum number of states 
that can be distinguished reliably in a single-shot measurement. Using these terms, Hardy 
derives the usual Hilbert-space formulation of quantum theory from the following five axioms, 
quoted verbatim from [9].

Axiom 1 Probabilities. Relative frequencies (measured by taking the proportion of times a 
particular outcome is observed) tend to the same value (which we call the probability) 
for any case where a given measurement is performed on a ensemble of n systems 
prepared by some given preparation in the limit as n becomes infinite.

Axiom 2 Simplicity. K is determined by a function of N (i.e. K = K(N)) where N =
1,2,... and where, for each given N, K takes the minimum value consistent with the 
axioms. 

Axiom 3 Subspaces. A system whose state is constrained to belong to an M dimensional 
subspace (i.e. have support on only M of a set of N possible distinguishable states) 
behaves like a system of dimension M. 

Axiom 4 Composite systems. A composite system consisting of subsystems A and B
satisfies $N = N_A N_B$ and $K = K_A K_B$.

Axiom 5 Continuity. There exists a continuous reversible transformation on a system 
between any two pure states of that system. 
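
As a small illustration of how Axioms 2 and 4 constrain the function K(N), the Python sketch below checks the composite-system condition $K(N_A N_B) = K(N_A) K(N_B)$ for a few candidate functions. The candidates are assumptions chosen for illustration: K = N and K = N^2 are instances of the power-law form K = N^r that Hardy's derivation arrives at (r = 2 being the quantum case), while K = N^2 - 1, the parameter count for normalized density matrices, is included as a function that fails the condition.

```python
from itertools import product


def is_multiplicative(K, max_dim=6):
    """Check K(N_A * N_B) == K(N_A) * K(N_B) for all 1 <= N_A, N_B <= max_dim."""
    dims = range(1, max_dim + 1)
    return all(K(na * nb) == K(na) * K(nb) for na, nb in product(dims, dims))


# Candidate functions K(N); the names and the selection are illustrative assumptions.
CANDIDATES = {
    "K = N": lambda n: n,
    "K = N^2": lambda n: n ** 2,
    "K = N^2 - 1": lambda n: n ** 2 - 1,
}

if __name__ == "__main__":
    for name, K in CANDIDATES.items():
        print(name, is_multiplicative(K))
```

The check covers only a finite range of dimensions, so a positive result is a consistency check and not a proof.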

These axioms are stated in a manifestly frequentist language. Axiom 1 defines prob- 
ability in terms of limiting frequency, and the number of degrees of freedom K, defined 
explicitly in terms of frequency measurements, has a central position in both axioms 2 and 
4. Nevertheless, a Bayesian formulation of Hardy's program turns out to be straightforward. 



III. THE BAYESIAN SETUP 

The Bayesian setup we are about to describe differs from Hardy's setup in the following 
ways. In the Bayesian formulation, it will not be necessary to refer to preparation devices 
or ensembles. Everything is expressed in terms of single physical systems. The concept of 
a probability measurement is not needed (see [?] for a Bayesian account of what it means
to effectively measure a quantum probability in a laboratory experiment). Axiom 1 can be 
eliminated. 

Our primitives are physical systems, transformation devices and measurement devices. 
As before, the non-null outcomes of a measurement device are labeled $l = 1, \ldots, L$. We
define a special class of measurements, so-called yes-no measurements, that have only two
outcomes, which we label yes and no. E.g., for a given measurement device, the questions
"is the outcome equal to $l_i$?" and "is the outcome in the set $S_i \subset \{1, \ldots, L\}$?" define yes-no
measurements. 

The state of a system is now defined to be any mathematical object that summarizes a 
physicist's state of belief about a system in that it can be used to determine the probabilities 
associated with the outcomes of any measurement that may be performed on the system. 
In this definition, probability means the physicist's degree of belief about the outcome of 
a measurement performed on a single system. Degrees of belief acquire an operational 
definition in decision theory and can be shown to obey the usual probability rules (see
[?] for details and references).

We now assume that there exists a number of yes-no measurements such that the proba- 
bilities for their outcomes determine the state fully. Let K be the minimum number of such 
yes-no measurements, and fix a set of K such measurements, the fiducial measurements. 
As before, the state is then given by the probabilities assigned to the yes outcomes of the 
fiducial measurements, i.e., by a vector $\mathbf{p} = (p_1, \ldots, p_K)^T$. For any yes-no measurement
there exists a function $f$ that maps any state $\mathbf{p}$ to the probability for the yes outcome if the
measurement is performed on a system to which $\mathbf{p}$ is assigned. This can be expressed as

$$ \Pr(\mathrm{yes}) = f(\mathbf{p}) . \tag{3} $$

Finally, as before, the dimension, N, of the system is defined as the maximum number of 
states that can be distinguished reliably in a single measurement. 

It turns out that most parts of Hardy's proof are unaffected by our reformulation.
Wherever Hardy refers to a probability measurement, we refer instead to "probability
assigned to the yes outcome of a yes-no measurement". The only exception is the part of the
proof that uses axiom 1 explicitly, i.e., sections 6.4 and 6.5 of [9].

Section 6.4 of [9] introduces the function $f$ defined above, and derives the inequality
$0 \le f(\mathbf{p}) \le 1$ from the assumption that probabilities are measured frequencies. We get this
inequality from the assumption that the probabilities assigned by a physicist are degrees
of belief and therefore obey the laws of probability. Since $f(\mathbf{p}) = \Pr(\mathrm{yes})$, it follows that
$0 \le f(\mathbf{p}) \le 1$. It is worth pointing out that the Bayesian derivation of this inequality does
not depend on the notion of repeated trials and is therefore completely independent of the
frequentist derivation [9].

Section 6.5 of [9] introduces the idea of a mixture of two quantum states, which is then
used to derive linearity of the quantum probability rule and quantum transformations. Hardy
defines a mixture as an ensemble consisting of a fraction $\lambda$ of systems prepared in a state
$\mathbf{p}_A$ and a fraction $1 - \lambda$ of systems prepared in a state $\mathbf{p}_B$. This construction cannot be used in
our Bayesian approach, which refers only to a single system, not a large ensemble of systems.
In particular, in the Bayesian approach, the mixing parameter $\lambda$ cannot be interpreted as a
limiting frequency of systems prepared in a particular way. A different proof of linearity is
therefore required.

Our alternative derivation of linearity is based on the idea of conditioning, which is central
to Bayesian theory in general. Assume that $\mathbf{p}_A$ and $\mathbf{p}_B$ are possible states for a given system.
Then we can imagine a situation in which a physicist's state assignment depends on some
event $E$. The event $E$ could be the outcome of a previous measurement, or some other piece
of information that affects his state assignments. If he knew that $E$ was true, he would
make the state assignment $\mathbf{p}_A$, and if he knew that $\neg E$ was true, he would make the state
assignment $\mathbf{p}_B$. We now assume that he does not know the truth value of $E$. Instead, he
assigns the probabilities $\Pr(E) = \lambda$ and $\Pr(\neg E) = 1 - \lambda$ to the events $E$ and $\neg E$, and makes
the state assignment $\mathbf{p}_C$.

If we now apply the function $f$ for a given yes-no measurement to the state $\mathbf{p}_A$, we obtain
the conditional probability for the outcome yes, given that $E$ is true,

$$ f(\mathbf{p}_A) = \Pr(\mathrm{yes}|E) . \tag{4} $$

Applying $f$ to the state $\mathbf{p}_B$ gives the conditional probability for the outcome yes, given that
$\neg E$ is true,

$$ f(\mathbf{p}_B) = \Pr(\mathrm{yes}|\neg E) . \tag{5} $$

Finally, applying $f$ to the state $\mathbf{p}_C$ gives the unconditional probability for the outcome yes,

$$ f(\mathbf{p}_C) = \Pr(\mathrm{yes}) . \tag{6} $$

Since we have assumed that the physicist's probability assignments are Bayesian degrees of 
belief, they must obey the usual probability rules (see above). In particular, they obey the 
law of total probability, 

$$ \Pr(\mathrm{yes}) = \Pr(\mathrm{yes}|E) \, \Pr(E) + \Pr(\mathrm{yes}|\neg E) \, \Pr(\neg E) . \tag{7} $$

By substituting Eqs. (4)-(6) and the definition of $\lambda$, we obtain

$$ f(\mathbf{p}_C) = \lambda f(\mathbf{p}_A) + (1 - \lambda) f(\mathbf{p}_B) . \tag{8} $$

This is the same equation that Hardy derives in section 6.5 of [9]. Following Hardy, we can
now apply Eq. (8) to the K fiducial measurements. For the k-th fiducial measurement, $f(\mathbf{p})$
is the k-th component of $\mathbf{p}$. Writing the K resulting equations in vector form, we obtain

$$ \mathbf{p}_C = \lambda \mathbf{p}_A + (1 - \lambda) \mathbf{p}_B , \tag{9} $$

which can be combined with Eq. (8) to give

$$ f(\lambda \mathbf{p}_A + (1 - \lambda) \mathbf{p}_B) = \lambda f(\mathbf{p}_A) + (1 - \lambda) f(\mathbf{p}_B) . \tag{10} $$

This establishes convex linearity of the function $f$. For a different Bayesian derivation of
Eq. (10), see [15], which builds on the theory of quantum Bayesian updating introduced in
[?].
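
The convex-linearity relations derived above can be checked numerically in a concrete case. The Python sketch below is an illustration under stated assumptions, not part of the derivation: it takes a qubit, represents states as 2x2 density matrices, and uses four projectors as a hypothetical fiducial set (K = 4). The example states, the mixing weight 0.3 and the measurement angle are arbitrary choices. Because the probabilities are computed with the standard trace rule, which is already linear, this is only a consistency check of the vector relation for the fiducial probabilities and of the relation for $f$.

```python
import math


def trace_product(a, b):
    """Return tr(a b) for two 2x2 matrices given as nested lists."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))


def mix(lam, rho_a, rho_b):
    """Entrywise convex combination lam * rho_a + (1 - lam) * rho_b."""
    return [[lam * rho_a[i][j] + (1 - lam) * rho_b[i][j] for j in range(2)]
            for i in range(2)]


# Hypothetical example states: rho_A = |0><0| and rho_B = |+><+|.
RHO_A = [[1.0, 0.0], [0.0, 0.0]]
RHO_B = [[0.5, 0.5], [0.5, 0.5]]

# Illustrative fiducial projectors for a qubit (an assumption of this sketch).
FIDUCIAL = [
    [[1.0, 0.0], [0.0, 0.0]],      # |0><0|
    [[0.0, 0.0], [0.0, 1.0]],      # |1><1|
    [[0.5, 0.5], [0.5, 0.5]],      # |+><+|
    [[0.5, -0.5j], [0.5j, 0.5]],   # |+i><+i|
]


def state_vector(rho):
    """Fiducial probability vector p = (p_1, ..., p_K)^T for a density matrix."""
    return [trace_product(rho, proj).real for proj in FIDUCIAL]


def f_yes(rho, theta=math.pi / 8):
    """Probability of 'yes' for the projector onto cos(theta)|0> + sin(theta)|1>."""
    c, s = math.cos(theta), math.sin(theta)
    p_yes = [[c * c, c * s], [c * s, s * s]]
    return trace_product(rho, p_yes).real


if __name__ == "__main__":
    lam = 0.3
    rho_c = mix(lam, RHO_A, RHO_B)
    print("p_C:", state_vector(rho_c))
    print("f(p_C):", f_yes(rho_c))
    print("lam f(p_A) + (1 - lam) f(p_B):",
          lam * f_yes(RHO_A) + (1 - lam) * f_yes(RHO_B))
```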



IV. DISCUSSION 

In the previous section, we have seen that most of Hardy's derivation of quantum theory 
remains valid if probabilities are given a Bayesian interpretation. In the Bayesian formu- 
lation, Hardy's frequentist axiom 1 can be omitted. Linearity now follows from the basic 
setup, where quantum states are defined as compendia of probabilities for the outcomes of 
arbitrary single-shot yes-no measurements. In this sense, quantum theory can be derived 
from the last four of Hardy's five axioms. 

It may seem, however, that something important is lost in the Bayesian approach. Hardy's 
version of quantum theory makes statements about actual frequencies measured in a labo- 
ratory, which are conspicuously absent from the Bayesian formulation given above. We will 
now review an almost trivial argument that establishes a tight connection between Bayesian 
quantum state assignments and measured frequencies. 

Suppose an experiment consisting of the preparation of a system and a subsequent yes-no 
measurement is repeated n times. Assume that the experimenter assigns the n-fold tensor 
product state 

$$ \rho^{\otimes n} = \rho \otimes \rho \otimes \cdots \otimes \rho \tag{11} $$

to the n copies of the system, where $\rho$ is a single-system density operator. Suppose the
single-system measurement is described by the projection operators $P_{\mathrm{yes}}$ and $P_{\mathrm{no}} = 1 - P_{\mathrm{yes}}$.
The probability for yes in the first measurement is then $q = \Pr(\mathrm{yes}) = \mathrm{tr}(\rho P_{\mathrm{yes}})$. The
probability for $k$ yes outcomes and $n - k$ no outcomes in $n$ repetitions of the experiment is
easily found to be

$$ \Pr(k) = \binom{n}{k} q^k (1 - q)^{n - k} , \tag{12} $$

which for large n is strongly peaked near k/n = q. The probability that the measured fre- 
quency is near q approaches 1 as n tends to infinity. The Bayesian starting point of regarding 
probability and measured frequency as two separate concepts thus leads to a transparent 
and tight connection between quantum states and measured frequencies. Nothing is lost by 
abandoning the a priori identification of probabilities with measured frequencies. 
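
The concentration of the binomial distribution around k/n = q can be made concrete with a short Python sketch. The function names, the value q = 0.3 and the tolerance eps = 0.05 are illustrative assumptions. The binomial probability is evaluated in log space with `math.lgamma` so that large n does not overflow.

```python
import math


def binomial_pmf(n, k, q):
    """Binomial probability of k 'yes' outcomes in n trials, for 0 < q < 1.
    Evaluated in log space to avoid overflow for large n."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(q) + (n - k) * math.log(1 - q))
    return math.exp(log_pmf)


def prob_frequency_near(n, q, eps):
    """Probability that the measured frequency k/n lies within eps of q."""
    return sum(binomial_pmf(n, k, q)
               for k in range(n + 1) if abs(k / n - q) <= eps)


if __name__ == "__main__":
    q, eps = 0.3, 0.05  # illustrative values, not taken from the paper
    for n in (10, 100, 1000, 10000):
        print(n, prob_frequency_near(n, q, eps))
```

For fixed eps the printed probability increases toward 1 as n grows, in line with the statement that the measured frequency is near q with probability approaching 1.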